As an upperclassman, I was curious about where my next transition in life might take me. In this document I explore the cost of living in popular US cities and the minimum wage in each state. By looking at cost of living and minimum wage data, I explore how money is valued in different regions.
The first dataset was scraped from Numbeo and contains cost of living information for major United States cities in 2018. It covers 132 cities, each with two values. The first value is the ‘cost of living index,’ an indicator of the price of consumer goods and services such as groceries, restaurants, transportation, and utilities. The values are indices relative to New York City, for which every index is 100. The second value is the ‘rent index,’ which works the same way. If a city has a rent index of 120, rent in that city is on average 20% more expensive than in New York City.
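To make the index arithmetic concrete, here is a minimal sketch in base R. The helper name `pct_vs_nyc` and the sample values are hypothetical and are not taken from the Numbeo data.

```r
# Convert an index scaled so that New York City = 100 into a
# percentage difference relative to New York City.
# `pct_vs_nyc` is a hypothetical helper name used only for this illustration.
pct_vs_nyc = function(index) {
  index - 100
}

# Hypothetical index values, for illustration only
example_index = c(nyc = 100, pricier_city = 120, cheaper_city = 65)

# Positive values are more expensive than New York City, negative values cheaper
pct_vs_nyc(example_index)
```

Because the baseline is fixed at 100, the percentage difference is just the index minus 100, so no division is needed.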
The second dataset was scraped from the U.S. Department of Labor and contains minimum wage data for 51 states (counting the District of Columbia). Some of the listed values are lower than the federal minimum wage, because states are allowed to set lower rates that apply under state-specific conditions such as number of employees and annual gross sales.
The third dataset contains latitude and longitude values for 28,889 cities. It was used only to enable a spatial visualization of the cost of living dataset.
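# cost of living formatting (`cost` and `convert` are assumed to be loaded already)
library(tidyr)  # separate(), gather()
library(dplyr)  # left_join(), group_by(), summarise(), %>%
cost = cost[-1]  # drop the first column
# split "city, state, country" into three columns
cost = separate(cost, City, into = c("city", "state","country"), sep = ',')
cost = cost[-3]  # drop the country column
# the state piece keeps a leading space; split on it and discard the empty part
cost = separate(cost, state, into = c("delete", "state"), sep = ' ')
cost = cost[-2]  # drop the empty "delete" column
cost$state = abbr2state(cost$state)  # abbreviation to full state name (abbr2state is presumably from the openintro package)
# keep columns 1, 4, 9 and 10 of the coordinates table (city, state_name and the coordinates)
convert = convert[,c(1,4,9,10)]
convert$state = convert$state_name
convert = convert[-2]  # drop the now-redundant state_name column
# attach coordinates to each city; cities without a match get NA coordinates
cost_map = left_join(cost,convert,by = c("city","state"))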
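The cleaning and join steps above can be sketched on a toy example. This is a base R approximation, not the code used in this document: the input strings, the coordinates, and the variable names `raw`, `toy`, `coords`, `merged`, and `unmatched` are all hypothetical. It assumes city labels in a "City, ST, Country" layout.

```r
# Hypothetical labels in the "City, ST, Country" layout assumed by the cleaning code
raw = c("New York, NY, United States", "Austin, TX, United States")

# Split on commas, then trim the leading space left on each piece
parts = strsplit(raw, ",", fixed = TRUE)
city = trimws(sapply(parts, `[`, 1))
abbr = trimws(sapply(parts, `[`, 2))

# state.abb and state.name are built-in base R vectors (50 states, no DC)
state = state.name[match(abbr, state.abb)]
toy = data.frame(city = city, state = state, stringsAsFactors = FALSE)

# Hypothetical coordinates table that covers only one of the two cities
coords = data.frame(city = "New York", state = "New York",
                    lat = 40.7, lng = -74.0, stringsAsFactors = FALSE)

# all.x = TRUE keeps every city, like a left join; unmatched cities get NA
merged = merge(toy, coords, by = c("city", "state"), all.x = TRUE)

# Cities that would be missing from the map because no coordinates matched
unmatched = merged[is.na(merged$lat), ]
unmatched
```

Checking for unmatched rows after a left join is a cheap way to spot cities whose names or states are spelled differently in the two tables.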
# geo layout template passed to plotly's layout() below; toRGB() comes from plotly
g = list(
scope = 'north america',
showland = TRUE,
landcolor = toRGB("gray95"),
subunitcolor = toRGB("gray85"),
countrycolor = toRGB("gray85"),
showlakes = TRUE,
lakecolor = toRGB("white"),
showsubunits = TRUE,
showcountries = TRUE,
resolution = 50,
projection = list(
type = 'conic conformal',
rotation = list(lon = -100)),
lonaxis = list(
showgrid = TRUE,
gridwidth = 0.5,
range = c(-140, -55),
dtick = 5),
lataxis = list(
showgrid = TRUE,
gridwidth = 0.5,
range = c(20, 60),
dtick = 5))
cost_map %>% group_by(state)%>% summarise("city number"=n())
## # A tibble: 44 x 2
## state `city number`
## <chr> <int>
## 1 Alabama 3
## 2 Alaska 1
## 3 Arizona 2
## 4 Arkansas 2
## 5 California 13
## 6 Colorado 4
## 7 Connecticut 2
## 8 District of Columbia 1
## 9 Florida 10
## 10 Georgia 3
## # … with 34 more rows
To explore areas with a relatively high cost of living, both the ‘Cost of Living Index’ and the ‘Rent Index’ were examined. The ‘Cost of Living Index’ for each major city is mapped below, with the size and color of each point determined by that city's index value. The larger and more yellow the point, the greater the ‘Cost of Living Index’ for that city.
The cost of living spikes in two areas: the east and west coasts. California and Florida have an especially high concentration of cities with a high ‘Cost of Living Index’; however, these two states also account for the largest share of major cities in the dataset, with 13 cities from California and 10 from Florida out of 132 in total.
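As a quick check on how much weight these two states carry, the shares can be computed directly from the counts quoted above. The variable names are hypothetical.

```r
# Counts reported in the text: 13 California and 10 Florida cities out of 132
city_counts = c(California = 13, Florida = 10)
total_cities = 132

# Share of the dataset contributed by each state, in percent
share = round(100 * city_counts / total_cities, 1)
share
```

Together the two states make up roughly a sixth of the cities, which is worth keeping in mind when reading the map.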
# cost of living index map interactive
p1_int = plot_geo(cost_map, lat = ~lat, lon = ~lng) %>%
add_markers(text = ~paste(city, state, `Cost.of.Living.Index`), color = ~`Cost.of.Living.Index`, size = ~`Cost.of.Living.Index`, hoverinfo = "text") %>%
colorbar(title = "Cost of Living Index<br />Relative to<br />New York City<br />as 100") %>%
layout(title = 'Cost of Living Index for Major US Cities 2018<br />(Hover for city)', geo = g)
p1_int
The ‘Rent Index’ was mapped the same way as the ‘Cost of Living Index,’ and the same geographical pattern was visible; however, the ‘Rent Index’ contained more extreme values than the ‘Cost of Living Index.’
For further investigation, the distributions of both variables were plotted next to each other in a boxplot. The plot reveals that rent in the most extreme cities can be very high compared with the median. For example, rent in New York City is 65% more expensive than the median. However, the cost of living in these extreme-rent cities differs from the median by much less. For example, the cost of living in New York City is 25% higher than the median.
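The comparison against the median can be sketched as follows. The rent values are hypothetical and chosen only to illustrate the calculation; `pct_above_median` is a hypothetical helper name. The points that the boxplot draws individually are those more than 1.5 times the interquartile range beyond the hinges, which `boxplot.stats()` reports in its `out` element.

```r
# Percent by which a value exceeds the median of a distribution
# (`pct_above_median` is a hypothetical helper for this illustration)
pct_above_median = function(value, x) {
  100 * (value - median(x)) / median(x)
}

# Hypothetical rent index values (New York City = 100), for illustration only
rent = c(30, 35, 40, 42, 45, 48, 50, 55, 60, 100)

# How far the New York City value sits above the median of these values
nyc_gap = pct_above_median(100, rent)
nyc_gap

# Values drawn as individual points by a boxplot of `rent`
extreme = boxplot.stats(rent)$out
extreme
```

Read in reverse, a city at 100 that is 65% above the median implies a median rent index of about 100 / 1.65, or roughly 61, while a 25% gap implies a median cost of living index of 80.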
# boxplot comparing distribution of rent and cost of living index
cost_map_gath = gather(cost_map, key = index, value =value, 3:4)
a = ggplot(cost_map_gath, aes(x = index, y = value)) + geom_boxplot(fill = 'green') +
  annotate("text", x = 1.5, y = 100, label = " <--- New York City --->") +
  annotate("text", x = 1.3, y = 90, label = "3 extrema") + annotate("text", x = 2.3, y = 90, label = "8 extrema") +
  labs(title = 'Boxplot of Cost of Living vs Boxplot of Rent') + scale_x_discrete(labels = c("Cost of Living", "Rent"))
ggplotly(a)
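# min wage formatting
library(stringr)  # str_remove_all()
min = read.csv("min_wage_clean.csv", header = TRUE)
min = min[,c(1,20)]  # keep the state column and the wage column (column 20)
names(min)[1] = 'state'
min$state = str_remove_all(min$state,fixed("\n"))  # strip newlines left over from scraping
min$region = tolower(min$state)  # lower-case names to match the map data's `region` column
names(min)[2] = 'min. wage'
# `states` is assumed to be state polygon data such as ggplot2::map_data("state")
comb_map = min%>%left_join(states,by="region")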
The minimum wage for each state was mapped. Higher minimum wages are concentrated on the west coast and the upper east coast. This fits the pattern of the ‘Cost of Living Index,’ except on the lower east coast. Some Southern states (Alabama, Louisiana, etc.) do not have a state minimum wage and adopt the federal minimum wage. As a result, how expensive a state is may not be reflected in its minimum wage. Because minimum wage is set by government policy, it may be an unreliable measure of overall wages.
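One way to handle states with no state minimum wage, or with a rate below the federal one, is to fall back to the federal rate. The sketch below is not part of the analysis above. It rests on a simplifying assumption that the federal rate ($7.25 per hour in 2018) applies in those cases; in practice, which rate applies can depend on the employer and employee. The state values and variable names are hypothetical.

```r
# Federal minimum wage in 2018, in dollars per hour
federal_min = 7.25

# Hypothetical state values; NA marks a state with no state minimum wage
state_min = c(state_a = 11.00, state_b = NA, state_c = 5.15)

# Simplifying assumption: use the federal rate where the state rate
# is missing or lower than the federal rate
effective_min = ifelse(is.na(state_min) | state_min < federal_min,
                       federal_min, state_min)
effective_min
```

Mapping an effective rate like this would keep states that adopt the federal minimum wage from appearing as missing values on the map.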
# min wage map interactive
min_map = ggplot(comb_map, aes(x=long, y=lat,fill=`min. wage`,text=toupper(region))) + geom_polygon(aes(group = group)) +coord_map() +theme(axis.title.x=element_blank(),axis.title.y=element_blank())+ labs(title='Minimum Wage by State (2018)')
ggplotly(min_map)
Living on the coast is more expensive than living inland. More cities have extreme rent costs than extreme cost of living values. This could be because space is a limited commodity while consumer goods are comparatively abundant. Individuals who live in areas with a higher cost of living can expect a higher minimum wage, except on the southeastern coast.